



Application No: GB 0301304.2
Claims searched: 1 and 14 at least

Examiner: Emily McGeehin
Date of search: 16 September 2003



Patents Act 1977 : Search Report under Section 17 





Documents considered to be relevant:

Category  Relevant to claims   Identity of document and passage or figure of particular relevance

X         1 and 14 at least    GB2319346A (SONY UK LTD)
                               Abstract
                               Figures 5 to 8
                               Page 3, lines 14 to 17
                               Page 4, lines 13 to 17 and lines 23 to 24
                               Page 7, lines 4 to 19

A                              EP1132720A2 (TEKTRONIX INC)
                               Abstract
                               Figure 1
                               Column 4, lines 11 to 21

A                              JP2002-354366A (MATSUSHITA ELECTRIC IND CO LTD)
                               Abstract
                               Figures 2 to 6
                               Translation paragraphs 0027 to 0034

A                              US5812688A (GIBSON)
                               Abstract
                               Figure 2
                               Column 3, line 66 to column 4, line 58
                               Column 5, lines 31 to 55

A                              JP07-084028A (ONO SOKKI CO LTD)
                               Abstract
                               Paragraph 0019

A                              Kashino and Murase; Sound source identification for ensemble music
                               based on music stream networks (Journal of Japanese Society for
                               Artificial Intelligence, Nov 1998, vol 13, num 2, pgs 962-970)

A                              Greuel; Sculpting 3D worlds with music; advanced texturing
                               techniques (Proceedings of SPIE, Feb 1996, vol 2653, pgs 306-315)



Categories:

X  Document indicating lack of novelty or inventive step
Y  Document indicating lack of inventive step if combined with one or more other documents of
   same category.
&  Member of the same patent family
A  Document indicating technological background and/or state of the art.
P  Document published on or after the declared priority date but before the filing date of this
   invention.
E  Patent document published on or after, but with priority date earlier than, the filing date of
   this application.



Field of Search: 

Search of GB, EP, WO & US patent documents classified in the following areas of the UKC V : 



An Executive Agency of the Department of Trade and Industry



(12) UK Patent Application (19) GB (11) 2319346 (13) A

(43) Date of A Publication 20.06.1998



(21) Application No 9623635.1 

(22) Date of Filing 1X11.1996 



(71) Applicant(s)

Sony United Kingdom Limited

(Incorporated in the United Kingdom)

The Heights, Brooklands, WEYBRIDGE, Surrey,
KT13 0XW, United Kingdom

(72) Inventor(s)

Peter Charles Eastty

(74) Agent and/or Address for Service
D Young & Co

21 New Fetter Lane, LONDON, EC4A 1DA,
United Kingdom



(51) INT CL 6

G01R 13/00

(52) UK CL (Edition P)

G1U UR1300
U1S S1940

(56) Documents Cited 
SA1 



US 5241302 A 



(58) Field of Search

UK CL (Edition O) G1U UR1300 UR1302 UR2300 UR2316 UR2317, G5C CAE CDBK CDBX

INT CL 6 G01R 13/00 13/02 13/26 13/40 23/00 23/16 23/17

Online: WPI



(54) Analysis of audio signals 

(57) Apparatus for analysing stereo audio signals comprises amplitude detecting means, phase detecting
means and means of generating a colour display in which colour indications are dependent upon the relative
amplitudes and phase correlation of two audio signals at a time of testing. The means of generating a colour
display may be arranged to provide colour indications for periodically successive times of testing and the
audio signals may be subdivided into frequency bands for being tested. Frequency band, time of test and the
colour indications relating to relative amplitude and phase information may be arranged to be displayed on a
screen.

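[Fig. 8 (front page): screen display with frequency on the vertical axis and time on the horizontal axis]

[Drawing sheets 1/6 to 6/6 not reproduced. Recoverable labels: encoded data word from MSB to LSB
(right, left, front, back); Fig. 7a, colour map labelled yellow, orange, lt green, red, white, green,
purple, blue and lt blue; Fig. 7b, labels front, V and back]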


ANALYSIS OF AUDIO SIGNALS 
This invention relates to analysing audio signals. 

Several techniques have been proposed for showing, on a visual display,
various technical features of an audio signal.

One previously proposed technique is the so-called "voice print". A typical
"voice print" represents a monophonic sound by a two dimensional image on a
computer screen, paper print or cathode ray tube display.

A horizontal axis is used to represent time, with the earliest on the left and the
latest (or most recent) time on the right. A vertical axis is used to represent
frequency, with lowest frequencies at the bottom and highest at the top.

It is usual for the vertical axis to be on a logarithmic scale, i.e. equal vertical
distances representing octave differences in frequency. The intensity of the image at
each point represents the intensity of the sound at the appropriate frequency and time.
The amplitude to intensity mapping used is usually logarithmic, i.e. changes in the
decibel (dB) value correspond to changes in intensity. Depending upon the type of
display used (paper printout or screen display) louder sounds may be represented by
a darker or lighter image element. Images may be either a static "snap-shot" of a
sound over a number of seconds, or may be continuously generated in real time either
onto a roll of paper, or scrolling across a screen.

Voice print images have been in use for over 40 years, maybe much longer.
However, they are essentially monophonic. If the technique is to be used
with a stereo signal, then either a separate voice print has to be produced for each
channel, or the two channels have to be combined so as to produce a single audio
signal whose time-dependent intensity can then be mapped onto the voice print.
Neither of these solutions then gives any indication of the relative phase of the stereo
channels.

Another previously proposed technique which allows the relative phase of a
stereo pair to be displayed graphically is the so-called "phase-scope" display.

In this device the left and right parts of a stereophonic signal are displayed on
an oscilloscope screen such that the left signal displaces the spot upwards along an
axis from the bottom right corner to the top left corner of the display and the right
signal displaces the spot upwards along an axis from the lower left corner to the top
right corner of the display. Given this arrangement the "phase-spot" displayed on a
phase scope may differentiate between the following signals:

SILENCE: Stationary spot in centre of screen.
LEFT only: Line from bottom right to top left.
RIGHT only: Line from bottom left to top right.
FRONT [1]: Vertical line.
BACK [2]: Horizontal line.
ORTHOGONAL [3]: Central elliptical/circular display.
RANDOM [4]: "Ball of wool" central display.
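
The spot geometry described above can be illustrated with a short sketch. It assumes that each channel displaces the spot along a 45 degree diagonal and that the two displacements simply add; the function name and the normalising factor are illustrative and do not come from the specification.

```python
import math


def phase_scope_point(left: float, right: float) -> tuple[float, float]:
    """Return the (x, y) spot position for one stereo sample.

    x grows towards the right of the screen and y grows upwards.
    A positive left sample moves the spot up and to the left, a
    positive right sample moves it up and to the right.
    """
    scale = math.sqrt(0.5)  # keeps each diagonal axis at unit length (assumption)
    x = (right - left) * scale
    y = (right + left) * scale
    return x, y
```

With this mapping, equal in-phase samples give x = 0 (a vertical FRONT line) and equal out-of-phase samples give y = 0 (a horizontal BACK line).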



However, although the phase scope provides useful graphical information
about the relative phases of the left and right channels, the "phase scope" also suffers
from a number of disadvantages:

In contrast to the voice print display, the phase scope display is transient and
requires that the operator keep an eye on it whenever anything interesting happens to
the audio signal under test.

Also, the "phase scope" display works on the aggregate stereo signal, which
is usually composed of the outputs of many instruments which have different
directional characteristics. This makes it difficult to distinguish the directional
information in one signal in the presence of all the others.

[1] Left and right of equal amplitude and in phase.

[2] Left and right of equal amplitude and out of phase (here, it is appreciated that the
term "back" is not strictly correct, but it is used throughout this description as a useful term
to distinguish out-of-phase signals from in-phase or "front" signals).

[3] Similar sinusoidal left and right signals with 90° phase difference.

[4] Uncorrelated left and right signals.

This invention provides apparatus for analysing audio signals from a stereo
pair of audio channels, the apparatus comprising:

magnitude detecting means for detecting the magnitudes of the audio signals
of the two audio channels;

phase detecting means for detecting a degree of phase correlation between the
audio signals of the two audio channels; and

means for generating an indicator colour for display in respect of the audio
channels at a time of test, the indicator colour having a hue, intensity and/or
saturation dependent on at least the relative magnitudes of and the degree of phase
correlation between the audio signals of the two audio channels at the time of test.
Audio analysis apparatus according to embodiments of the invention provides
advantages of both the "voice print" and the "phase scope" type of display, by
allowing the phase and intensity of two stereo audio channels to be displayed, with
a displayed "history" showing the temporal variation of these values over a period of
time. A further extra feature of at least embodiments of the invention is that the
information is split up by frequency band, so that phase effects occurring at particular
frequency bands (e.g. effects arising in sound picked up from particular musical
instruments) can easily be identified.

The invention will now be described by way of example with reference to the
accompanying drawings, throughout which like parts are referred to by like
references, and in which:

Figure 1 is a schematic illustration of an audio analysis apparatus; 

Figure 2 schematically illustrates an input stage of the apparatus of Figure 1;

Figure 3 schematically illustrates one channel of a filtering stage of the
apparatus of Figure 1;

Figures 4a to 4d schematically illustrate filter responses of band-pass filters
in the filtering stage of Figure 3;

Figure 5 schematically illustrates an encoding stage of the apparatus of Figure 1;

Figure 6 schematically illustrates an encoded data word output by the encoding
stage of Figure 5;

Figures 7a and 7b schematically illustrate a mapping between the encoded data
words of Figure 6 and display colours; and

Figure 8 is a schematic representation of a screen display generated by the 
apparatus of Figure 1. 

Figure 1 is a schematic illustration of an apparatus for audio analysis. The
apparatus operates, in some respects, in a similar manner to a standard "voice print",
but also uses display colour to display some of the phase information usually
represented by the "phase scope".

Referring now to Figure 1, the apparatus comprises an input stage 10 for
receiving audio signals representing left and right audio channels, a filtering stage 20,
an encoding stage 30, a mapping stage 40 and a display device 50. The function of
each of these stages will be described in detail below with reference to the remaining
figures.

Figure 2 schematically illustrates the input stage 10 of the apparatus. Audio
signals (referred to as "left signal" and "right signal") are supplied, if necessary, to
respective analogue-to-digital converters 100. These are used if the input audio
signals are in analogue form; clearly, if the input audio signals are in digital form
already, there is no need for the analogue-to-digital converters 100.

Digitised left and right audio signals are then supplied to respective high pass
digital filters 110, which are arranged to pass substantially all frequencies other than
a DC level to remove any DC offset generated by the analogue-to-digital conversion
process (wherever in the system that occurred). In the present example, this is
achieved by having a high pass filter with a passband of 1 Hz upwards.
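
As an illustration only, a DC-removing high pass filter of this kind might be sketched as a first-order "DC blocker". The specification states only the 1 Hz passband; the filter topology, the sample rate and the name `dc_block` are assumptions made for this sketch.

```python
import math


def dc_block(samples, sample_rate=48000.0, cutoff_hz=1.0):
    """First-order DC-blocking high pass filter (hypothetical helper).

    Implements y[n] = x[n] - x[n-1] + r * y[n-1], where r is set just
    below 1 so that only frequencies near DC are attenuated.
    """
    # Small-angle approximation for the pole radius (assumption).
    r = 1.0 - 2.0 * math.pi * cutoff_hz / sample_rate
    prev_x = 0.0
    prev_y = 0.0
    out = []
    for x in samples:
        y = x - prev_x + r * prev_y
        prev_x, prev_y = x, y
        out.append(y)
    return out
```

A constant (pure DC) input decays towards zero at the output, while audio-band content passes almost unchanged.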

The outputs of the two high pass filters are supplied in parallel to a respective
pair of multipliers 120 and to an absolute value detector and comparator 130. The
absolute value detector generates an absolute value from the output of each high pass
filter 110, subject to a fast attack and slow decay function. This unit then detects the
maximum of the two absolute values for the left and right channel respectively, and
calculates a reciprocal value from this maximum value. The reciprocal value is then
multiplied by each of the left and right signals in the multipliers 120. In this way,
the two signals are scaled by an amount dependent upon the magnitude of the larger
of the two signals, to provide a fast attack / slow decay automatic gain control
(AGC).
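
The gain control just described can be sketched as follows. The decay constant, the small floor that avoids division by zero and the function name `agc` are assumptions for illustration; the specification gives no numerical values.

```python
def agc(left, right, decay=0.999, floor=1e-9):
    """Scale both channels by the reciprocal of the larger envelope.

    Each envelope jumps immediately to a new peak (fast attack) and
    otherwise shrinks by `decay` per sample (slow decay).
    """
    env_left = 0.0
    env_right = 0.0
    scaled = []
    for l, r in zip(left, right):
        env_left = max(abs(l), env_left * decay)
        env_right = max(abs(r), env_right * decay)
        # One common gain for both channels preserves their relative levels.
        gain = 1.0 / max(env_left, env_right, floor)
        scaled.append((l * gain, r * gain))
    return scaled
```

Because a single gain is applied to both channels, the left/right balance that the later stages depend on is left unchanged.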

Figure 3 illustrates a part of one channel of the filtering stage 20 of the
apparatus of Figure 1.

In particular, in Figure 3 the "scaled left" signal generated by the left-channel
multiplier 120 of Figure 2 is supplied in parallel to a bank of similar (or identical)
"Q" band-pass filters 135. Each of the band-pass filters has a different frequency
passband, as shown schematically in the sequence of Figures 4a to 4d (where
frequency is represented on the horizontal axis and the filtering gain is represented
on the vertical axis). The passbands are substantially non-overlapping, and
correspond to the different frequency ranges used in the analysis display (see Figure
8 below).

So, the filtering stage of Figure 3 outputs, for each audio channel, a set of
band-pass filtered signals, one from each of the band-pass filters 135. In this
embodiment, there are 61 band-pass filters 135 for each audio channel - 6 per octave
for ten octaves, including one at each end of the overall frequency range.
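
The count of 61 follows from 6 x 10 + 1. A minimal sketch of one possible set of centre frequencies is given below; the lowest frequency of 20 Hz and the exact geometric spacing are assumptions, since the specification does not state the band frequencies.

```python
def band_centres(f_low=20.0, per_octave=6, octaves=10):
    """Centre frequencies spaced `per_octave` to the octave.

    One extra band is included so that both ends of the range are covered.
    """
    count = per_octave * octaves + 1
    return [f_low * 2.0 ** (k / per_octave) for k in range(count)]
```

With these assumed values the bands run from 20 Hz to 20480 Hz, and every sixth band is exactly one octave above the band six places below it.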

Figure 5 schematically illustrates the encoding stage 30 of the apparatus of
Figure 1. In fact, the components shown in Figure 5 are replicated, once for each
pair of band-pass filters 135 (one left, one right) of the filtering stage 20.

The inputs to the encoding stage 30 are a band-pass filtered left signal from
the filtering stage and the band-pass filtered right signal corresponding to the same
frequency band.

These two signals are multiplied together in a multiplier 140 to generate a
left*right signal. Also, the BPF left signal is squared in a multiplier 150 to generate
a left*left signal, and the BPF right signal is squared in a multiplier 160, to generate
a right*right signal. The final piece of processing to mention here is that the
left*right signal is also negated (by multiplying by -1) in a multiplier 170.

So, at this point in the discussion, the following four signals have been
generated:

left*left
right*right
left*right
-left*right.

The left*left signal is a good indicator of the LEFT signal amplitude; similarly
the right*right signal is a good indicator of the RIGHT signal amplitude. As
described below, a positive peak-following envelope detector (a circuit which tracks
and holds, with a defined decay rate, positive-going peaks in the signal) is used so
that transient peak effects can be observed.

The left*right signal is largely positive when the left and right signals are in
phase (a FRONT signal) and is negative when the left and right signals are out of
phase (a BACK signal). Thus in this embodiment a positive peak detector is applied
to this (left*right) signal to give a good indicator of the FRONT signal, and a positive
peak detector is applied to the -left*right signal to give a good indicator of the BACK
signal.

Bearing this in mind, each of the four signals mentioned above is processed
by a respective positive peak detector 180 comprising a scaled multiplication stage,
a maximum detector and a delay element.
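
A minimal sketch of the four products and the positive peak detector follows. It reads "scaled multiplication stage, maximum detector and delay element" as the recurrence held[n] = max(x[n], decay * held[n-1]); that reading, the decay value and the function names are assumptions.

```python
def peak_follow(values, decay=0.99):
    """Track positive-going peaks and let them decay between peaks."""
    held = 0.0  # starting at zero means negative inputs never register
    out = []
    for x in values:
        held = max(x, held * decay)
        out.append(held)
    return out


def encode_band(left, right, decay=0.99):
    """Return peak-followed LEFT, RIGHT, FRONT and BACK indicators for one band."""
    left_sq = [l * l for l in left]
    right_sq = [r * r for r in right]
    product = [l * r for l, r in zip(left, right)]
    negated = [-p for p in product]
    return {
        "left": peak_follow(left_sq, decay),
        "right": peak_follow(right_sq, decay),
        "front": peak_follow(product, decay),
        "back": peak_follow(negated, decay),
    }
```

For identical left and right inputs the "front" output follows the signal power and "back" stays at zero; inverting one channel swaps the two.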

Each of the four output signals from the positive peak detectors is supplied to 
a respective linear-to-logarithmic converter 190 and from there to a bit shifter 200. 

Referring to Figure 5 and to Figure 6, the bit shifters 200 shift the numerical
outputs of each of the logarithmic converters 190 by differing amounts so that when
the four resulting bit-shifted values are added by a cascade of adders 210, the back,
front, left and right signals occupy different bits in a single encoded data word, as
shown in Figure 6.

In particular, assuming (in this example) that a 16-bit encoded data word is
used, the right signal occupies the four most significant bits, followed by the left
signal, the front signal and finally the back signal occupying the four least significant
bits.

This encoding process is not of course essential, but is used simply to provide
a convenient data transport technique between the encoding stage 30 and the mapping
stage 40.
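
The 16-bit word layout can be sketched as below, assuming each of the four values has already been reduced to a 4-bit code (0 to 15) by the logarithmic converter. The function names are illustrative. Because the four fields do not overlap, adding the shifted values gives the same result as a bitwise OR.

```python
def pack_word(right, left, front, back):
    """Pack four 4-bit codes into one 16-bit word, right signal in the top bits."""
    for code in (right, left, front, back):
        if not 0 <= code <= 15:
            raise ValueError("each code must fit in four bits")
    # Shift amounts 12, 8, 4 and 0 place the fields side by side.
    return (right << 12) + (left << 8) + (front << 4) + back


def unpack_word(word):
    """Recover (right, left, front, back) from a packed 16-bit word."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF
```

For example, `pack_word(0xA, 0x3, 0xF, 0x1)` gives `0xA3F1`, and `unpack_word` reverses it.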

In the mapping stage 40, the FRONT, BACK, LEFT and RIGHT signals
generated (for each respective filter pass-band) at the encoding stage are mapped into
respective display colours.

A number of mappings from filtered stereo signal to intensity, hue and
saturation are possible. However, in this embodiment, in order to retain compatibility
with a monophonic "voice print", the logarithm of the intensity of the stereo signal is
mapped directly to the intensity of the resulting display. The saturation of the display
(the "amount" of colour) is controlled by the amount of directionality in the stereo
signal. Substantially ORTHOGONAL and RANDOM signals (henceforth referred
to generically as CENTRE signals) give zero or very low saturation (producing black,
grey or white), whereas strongly LEFT, RIGHT, FRONT or BACK signals produce
highly saturated colours.

The selection of which hue to use for which direction is open to multiple
interpretations. In some traditional stereo level meters the convention red = left,
green = right is used, as this coincides with port and starboard navigation lights used
in nautical and aeronautical applications.

Using this as a starting point for LEFT and RIGHT, the FRONT and BACK
signals remain to be assigned. In use the FRONT colour should appear to be in some
sense "in between" the LEFT and RIGHT colours (since in audio terms it
actually is produced by a mixture of the left and right signals), whereas the BACK
colour should be distinct (since it is important that unintentionally produced BACK
elements in the stereo signal be quickly identified). This embodiment uses yellow for
the FRONT signal (which is actually made from red and green in most modern
displays) and blue for the BACK signal (which is easily distinguished from the
other two primaries, red and green).

The combined hue and saturation mappings may be seen thus:

              FRONT
    LEFT      CENTRE     RIGHT
              BACK

              YELLOW
    RED       WHITE      GREEN
              BLUE



This mapping is quasi-continuous, within the context of the four-bit 
quantisation applied to the left, right, front and back signals, in that intermediate 
values on the LEFT, RIGHT, FRONT, BACK map are translated into intermediate 
colours on the hue/saturation map. 

There are several possible methods of converting the left and right filter output
signals into an intensity and a position on the LEFT, RIGHT, FRONT, BACK map.
One such method is that once the LEFT, RIGHT, FRONT, BACK signals (left*left,
right*right, left*right and -left*right) have been generated, an H (horizontal) signal
is produced from (RIGHT - LEFT), and a V (vertical) signal is produced from
(FRONT - BACK). These H and V signals are used to select the appropriate position
in a read-only memory (ROM) containing the hue/saturation map table.

This process is illustrated schematically in Figures 7a and 7b. Figure 7a
illustrates the hue/saturation table with reference to the four signal directions
(FRONT, BACK, LEFT, RIGHT) of Figure 7b, as transformed to the H and V axes.
The ROM containing the table is simply a look-up table of palette values to drive a
video display, so that for each pair of (H, V) values (used as ROM addresses), a
respective set of hue, saturation and intensity values is defined and stored at that
address.
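
The look-up can be sketched as a table indexed by (H, V). The specification does not list the ROM contents, so the table here is filled by blending the four direction colours named in the description (red, green, yellow, blue) towards white at the centre; that blending rule, the RGB representation and all names are assumptions.

```python
import math

RED, GREEN, YELLOW, BLUE = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)
MAX_CODE = 15  # largest value of a 4-bit LEFT, RIGHT, FRONT or BACK code


def hv(right, left, front, back):
    """H = RIGHT - LEFT and V = FRONT - BACK, each in the range -15..15."""
    return right - left, front - back


def colour_for(h, v):
    """Blend direction colours by how far (h, v) points towards each of them."""
    weights = [(max(-h, 0), RED), (max(h, 0), GREEN), (max(v, 0), YELLOW), (max(-v, 0), BLUE)]
    total = sum(w for w, _ in weights)
    if total == 0:
        return (1.0, 1.0, 1.0)  # CENTRE: white, zero saturation
    direction = [sum(w * c[i] for w, c in weights) / total for i in range(3)]
    saturation = min(1.0, math.hypot(h, v) / MAX_CODE)
    # Mix from white (unsaturated) towards the pure direction colour.
    return tuple((1.0 - saturation) + saturation * d for d in direction)


# The "ROM": one palette entry for every possible (H, V) address.
ROM = {(h, v): colour_for(h, v)
       for h in range(-MAX_CODE, MAX_CODE + 1)
       for v in range(-MAX_CODE, MAX_CODE + 1)}
```

Under this rule a position midway between LEFT and FRONT comes out as (1.0, 0.5, 0.0), an orange, in line with the intermediate colours labelled in Figure 7a. Intensity would be applied separately, as described for the mapping operation.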

The mapping operation is performed repeatedly for each frequency band
(defined by each left-right pair of band-pass filters). Starting at the centre of the
mapping table (for CENTRE signals) the hue is white and the saturation is very low.
The intensity which is displayed increases with increasing amplitude of the CENTRE
signal. Moving away from the centre in a particular direction defines a hue other
than white (e.g. orange), and the displacement from the centre defines the saturation -
further from the centre of the table corresponding to higher saturation. The intensity
of the colour to be displayed for that frequency band is proportional to or otherwise
dependent on the amplitude of the signal in that frequency band.

The amplitude (for use in mapping an intensity value) could be measured as,
for example, the sum of the two signals left + right.

The mapping is performed for each of the frequency bands, at regularly
spaced time intervals, using, on each occasion, the latest available signal values from
the encoding stage 30. The resulting colour (hue, saturation, intensity) for each
frequency band is then displayed at a vertical screen position dependent on the
frequency of that band, and at a horizontal position dependent on the time at which
the mapping took place.
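
The placement rule can be sketched as a scrolling buffer with one column per time of test and one row per frequency band. The class name, the fixed display width and the choice to put the highest band on the top row are assumptions for illustration.

```python
from collections import deque


class ScrollingDisplay:
    """Holds the most recent columns of per-band colours."""

    def __init__(self, n_bands=61, width=8):
        self.n_bands = n_bands
        # Oldest column is dropped automatically once `width` is exceeded.
        self.columns = deque(maxlen=width)

    def push(self, band_colours):
        """Append one column; band_colours[0] is the lowest frequency band."""
        if len(band_colours) != self.n_bands:
            raise ValueError("one colour per frequency band is required")
        self.columns.append(list(band_colours))

    def pixel(self, x, y):
        """Colour at column x (0 = least recent) and row y (0 = top of screen)."""
        return self.columns[x][self.n_bands - 1 - y]
```

Each call to `push` corresponds to one time of test; reading along a fixed `y` gives the history of one frequency band.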

Figure 8 is a schematic representation of a screen display generated by the 
apparatus of Figure 1. 

In Figure 8, time is represented along a horizontal axis from left (least recent)
to right (most recent). Frequency is represented along a vertical axis, split into 61
frequency bands. In reality, there would be one band for each left-right pair of band-pass
filters 135, but for simplicity of the drawing only a relatively small number of
bands are illustrated.

Some areas are shown shaded in various shades of grey. Within the formal
restrictions placed on patent drawings, these shades are intended to represent the
different hues, intensities and saturations assigned to those frequency bands at those
times by the filtering and encoding stages.

So, following along the same horizontal level of the display from left to right,
it is possible to see the signal content and channel phase at a particular frequency
band, over time. (Often a particular frequency band or small group of bands might
contain mainly signals from a particular sound source - such as a drum at low
frequencies or a trumpet over an octave or so at relatively high frequencies).
Looking in the vertical direction at a particular time instant (a particular point on the
horizontal axis) it is possible to see where in the frequency spectrum the audio energy
is concentrated, and the relative phase of the two channels at each frequency.

CLAIMS

1. Apparatus for analysing audio signals from a stereo pair of audio channels, the
apparatus comprising:

magnitude detecting means for detecting the magnitudes of the audio signals
of the two audio channels;

phase detecting means for detecting a degree of phase correlation between the
audio signals of the two audio channels; and

means for generating an indicator colour for display in respect of the audio
channels at a time of test, the indicator colour having a hue, intensity and/or
saturation dependent on at least the relative magnitudes of and the degree of phase
correlation between the audio signals of the two audio channels at the time of test.

2. Apparatus according to claim 1, in which the apparatus is operable to generate
successive indicator colours at periodically successive times of test.

3. Apparatus according to claim 2, comprising: for each audio channel, one or
more filters for filtering the audio signal of that channel into two or more frequency
bands; and in which:

the magnitude detecting means detects the magnitudes of corresponding pairs
of frequency bands from the two channels;

the phase detecting means detects the degree of phase correlation between
pairs of frequency bands from the two channels; and

the means for generating an indicator colour generates a respective indicator
colour, for display in respect of a frequency band of the audio channels at a time of
test.

4. Apparatus according to claim 3, comprising means for displaying the indicator
colours on a display screen, each indicator colour being displayed at a screen position
dependent on the time of test and the frequency band for which that display colour
was generated.

5. Apparatus according to claim 4, in which the screen position for display of an
indicator colour has a horizontal screen position dependent on the time of test and a
vertical screen position dependent on the frequency band in respect of which that
indicator colour was generated.

6. Apparatus according to any one of the preceding claims, in which the phase
detecting means comprises means for detecting the magnitude of a sum of the audio
signals of the two audio channels and the [illegible] of the two [illegible].

7. Apparatus for analysing audio signals, the apparatus being substantially as
hereinbefore described with reference to the accompanying drawings.